Topic: An Analysis of Food Security in the USA, 2021


Our team’s research topic is the situation of food security in 2021. We want to know how different demographic and socioeconomic factors relate to food security.

We are using the Current Population Survey - Food Security Supplement (December 2021) data provided by the US Census Bureau.

The dataset contains 507 variables and roughly 120,000 observations.


The SMART questions we have proposed and hope to answer are:

Specific: Study the patterns in the data that relate to food security, across factors such as state, county, income level, whether the family uses SNAP, race, immigrant status, work status, education level, and other demographic and socioeconomic variables.

Measurable: Use EDA techniques and statistical tests to assess how strongly different factors are associated with food insecurity.

Achievable: We can identify variables that are significantly associated with food insecurity and use them to build models of household food security.

Relevant: Food is a basic requirement of every human, so this study can shed light on what the authorities and we ourselves can do to eradicate food insecurity.

Time-oriented: The data set for December 2021 is used so that the study can also show the effect of Covid-19 on food security.


Considering the questions we are asking, we have decided to select just 11 factors to work on.

A very significant limitation of our data is that we have trimmed off many observations where the interview was either not taken or not completed. Ideally we should account for these observations somehow, but due to time constraints we are not doing that.


## 'data.frame':    71472 obs. of  12 variables:
##  $ Id                : Factor w/ 27922 levels "5185410966","8178510165",..: 16600 9378 9378 8472 8472 7861 7861 19375 19375 24604 ...
##  $ States            : Factor w/ 51 levels "1","2","4","5",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ Family_Size       : Factor w/ 14 levels "1","2","3","4",..: 1 2 2 2 2 2 2 2 2 1 ...
##  $ Household_Income  : Factor w/ 16 levels "1","2","3","4",..: 16 14 14 12 12 13 13 9 9 11 ...
##  $ SNAP              : Factor w/ 5 levels "-3","-2","-1",..: 3 3 3 5 5 3 3 5 5 3 ...
##  $ Ethnicity         : Factor w/ 24 levels "1","2","3","4",..: 1 1 1 1 1 1 1 2 2 1 ...
##  $ Citizenship_status: Factor w/ 5 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ Number_of_Jobs    : Factor w/ 4 levels "-1","2","3","4": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Hours_on_Jobs     : Factor w/ 88 levels "-4","-1","0",..: 67 43 2 43 43 62 43 2 2 2 ...
##  $ Education_Level   : Factor w/ 17 levels "-1","31","32",..: 14 15 1 14 14 10 10 7 10 5 ...
##  $ FoodSecurity_score: Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 1 2 2 1 ...
##  $ PRNMCHLD          : Factor w/ 12 levels "0","1","2","3",..: 1 2 1 1 1 1 1 1 1 1 ...

Our response variable, Food Security, has four levels:


High Food Security: No reported indications of food-access problems or limitations.

Marginal Food Security: One or two reported signs, usually anxiety over food availability or scarcity in the home. There is little to no evidence that diets or food intake have changed.

Low Food Security: Reports of reduced quality, variety, or desirability of diet. Little or no indication of reduced food intake.

Very Low Food Security: Reports of numerous signs of altered eating habits and decreased food intake.



Ethnicity


  • We have 24 Ethnicity levels in this data (per the factor summary above). We will explore the relationship between Ethnicity and Food Security graphically, and do some statistical testing to confirm that relationship

Plotting bar charts of food security status for each Ethnicity level

Statistical Testing


  • We are using Fisher’s Exact Test instead of the Chi-square test because many levels have a low frequency of observations

  • Our Null Hypothesis is that Ethnicity and Food Security Status are Independent of each other.

  • Taking our alpha to be 5%


## 
##  Fisher's Exact Test for Count Data with simulated p-value (based on
##  2000 replicates)
## 
## data:  FS_Subset$Ethnicity and FS_Subset$FoodSecurity_score
## p-value = 0.0004998
## alternative hypothesis: two.sided

  • Since the p-value is less than our chosen alpha, we reject the null hypothesis: there is a statistically significant relationship between Ethnicity and Food Security
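As a rough illustration of the test above, the sketch below runs Fisher's exact test with a simulated p-value on a small made-up table (the counts and group names are hypothetical, not taken from the survey). It also shows that with 2000 replicates the smallest p-value the simulation can report is 1/(2000 + 1), which rounds to 0.0004998, so the value reported above sits at the floor of the simulation.

```r
# Hypothetical 3 x 4 table with several sparse cells (illustrative counts only)
toy <- matrix(c(12, 3, 1, 0,
                 9, 4, 2, 1,
                 2, 1, 3, 4),
              nrow = 3, byrow = TRUE,
              dimnames = list(Group = c("A", "B", "C"),
                              FoodSecurity = c("High", "Marginal", "Low", "Very Low")))

# The p-value is simulated, so fix the seed for reproducibility
set.seed(1)
res <- fisher.test(toy, simulate.p.value = TRUE, B = 2000)
res$p.value

# Smallest p-value attainable with B = 2000 replicates
min_p <- 1 / (2000 + 1)
signif(min_p, 4)
```

If a finer p-value is needed, increasing `B` lowers this floor at the cost of run time.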

Citizenship


  • We have 5 levels of Citizenship status in this data. We will explore the relationship between Citizenship and Food Security graphically, and do some statistical testing to confirm that relationship

Plotting bar charts of food security status for each Citizenship status

Statistical Testing


  • We are using the Chi-square test

  • Our Null Hypothesis is that Citizenship Status and Food Security Status are Independent of each other.

  • Taking our alpha to be 5%


## 
##  Pearson's Chi-squared test
## 
## data:  FS_Subset$Citizenship_status and FS_Subset$FoodSecurity_score
## X-squared = 437.62, df = 12, p-value < 2.2e-16

  • Since the p-value is less than our chosen alpha, we reject the null hypothesis: there is a statistically significant relationship between Citizenship Status and Food Security

SNAP


  • SNAP Stands for Supplemental Nutrition Assistance Program
  • The SNAP factor had 5 levels in this dataset
  • We have dropped a level which indicated that the observations are not in universe

Plotting Stacked barcharts between SNAP status and food security status

Statistical Testing


  • We are using the Chi-square test

  • Our Null Hypothesis is that SNAP Status and Food Security Status are Independent of each other.

  • Taking our alpha to be 5%


## 
##  Pearson's Chi-squared test
## 
## data:  chi_test_SNAP
## X-squared = 764.1, df = 3, p-value < 2.2e-16

  • Since the p-value is less than our chosen alpha, we reject the null hypothesis: there is a statistically significant relationship between SNAP Status and Food Security

Odds Ratio

##              Outcome +    Outcome -      Total        Inc risk *        Odds
## Exposed +         4471         2737       7208              62.0        1.63
## Exposed -        14258         4008      18266              78.1        3.56
## Total            18729         6745      25474              73.5        2.78
## 
## Point estimates and 95% CIs:
## -------------------------------------------------------------------
## Inc risk ratio                                 0.79 (0.78, 0.81)
## Odds ratio                                     0.46 (0.43, 0.49)
## Attrib risk in the exposed *                   -16.03 (-17.30, -14.76)
## Attrib fraction in the exposed (%)            -25.84 (-28.34, -23.40)
## Attrib risk in the population *                -4.54 (-5.34, -3.73)
## Attrib fraction in the population (%)         -6.17 (-6.68, -5.66)
## -------------------------------------------------------------------
## Uncorrected chi2 test that OR = 1: chi2(1) = 682.162 Pr>chi2 = <0.001
## Fisher exact test that OR = 1: Pr>chi2 = <0.001
##  Wald confidence limits
##  CI: confidence interval
##  * Outcomes per 100 population units

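  • The odds ratio is 0.46, with a 95% confidence interval of (0.43, 0.49)
  • This means the odds of being food secure for a person on SNAP are 0.46 times those of a person not on SNAP. Equivalently, the odds of a person not on SNAP being food secure are about 2.2 times (1/0.46) the odds of a person on SNAP.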
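The odds ratio and its Wald interval can be checked by hand from the four cell counts in the table above. This is a minimal sketch, assuming "Exposed +" means on SNAP and "Outcome +" means food secure; the variable names are made up for illustration.

```r
# Cell counts copied from the 2 x 2 table above
# (assumption: rows are on SNAP / not on SNAP, columns are food secure / not)
n11 <- 4471;  n12 <- 2737   # on SNAP:     food secure, not food secure
n21 <- 14258; n22 <- 4008   # not on SNAP: food secure, not food secure

# Odds ratio as the cross-product ratio
or <- (n11 * n22) / (n12 * n21)

# Wald 95% confidence interval, built on the log scale
se_log_or <- sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
ci <- exp(log(or) + c(-1, 1) * qnorm(0.975) * se_log_or)

round(c(OR = or, lower = ci[1], upper = ci[2]), 2)   # 0.46 0.43 0.49
round(1 / or, 2)                                     # inverse odds ratio, 2.18
```

Inverting the unrounded odds ratio gives about 2.18, slightly above the 2.17 obtained from the rounded 0.46.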

##                 Id Number_of_Jobs Hours_on_Jobs Education_Level
## 1  404006407110031             -1            65              43
## 7  147240092351000             -1            40              44
## 8  147240092351000             -1            -1              -1
## 16 128450301231000             -1            40              43
## 17 128450301231000             -1            40              43
## 18 114580195861000             -1            60              39
##    FoodSecurity_score
## 1  High Food Security
## 7  High Food Security
## 8  High Food Security
## 16 High Food Security
## 17 High Food Security
## 18 High Food Security
## 'data.frame':    71472 obs. of  5 variables:
##  $ Id                : Factor w/ 27922 levels "5185410966","8178510165",..: 16600 9378 9378 8472 8472 7861 7861 19375 19375 24604 ...
##  $ Number_of_Jobs    : Factor w/ 4 levels "-1","2","3","4": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Hours_on_Jobs     : Factor w/ 88 levels "-4","-1","0",..: 67 43 2 43 43 62 43 2 2 2 ...
##  $ Education_Level   : Factor w/ 17 levels "-1","31","32",..: 14 15 1 14 14 10 10 7 10 5 ...
##  $ FoodSecurity_score: Factor w/ 4 levels "High Food Security",..: 1 1 1 1 1 1 1 2 2 1 ...

Renaming Education level, Number of Hours on Job per week and Number of Jobs

##                           Not Applicable 
##                                    12558 
##                      Less Than 1st Grade 
##                                      144 
##               1st, 2nd, 3rd Or 4th Grade 
##                                      251 
##                         5th Or 6th Grade 
##                                      446 
##                         7th Or 8th Grade 
##                                      912 
##                                9th Grade 
##                                     1241 
##                               10th Grade 
##                                     1538 
##                               11th Grade 
##                                     1625 
##                    12th Grade No Diploma 
##                                      910 
##  High School Grad-Diploma Or Equiv (Ged) 
##                                    16004 
##               Some College But No Degree 
##                                     9492 
## Associate Degree-Occupational/Vocational 
##                                     2454 
##        Associate Degree-Academic Program 
##                                     3257 
##                         Bachelors Degree 
##                                    12871 
##                           Masters Degree 
##                                     5720 
##                  Professional School Deg 
##                                      858 
##                         Doctorate Degree 
##                                     1191
## Not Applicable         2 Jobs         3 Jobs 4 or more jobs 
##          69584           1706            159             23
## 'data.frame':    71472 obs. of  5 variables:
##  $ Id                : Factor w/ 27922 levels "5185410966","8178510165",..: 16600 9378 9378 8472 8472 7861 7861 19375 19375 24604 ...
##  $ Number_of_Jobs    : Factor w/ 4 levels "Not Applicable",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ Hours_on_Jobs     : num  67 43 2 43 43 62 43 2 2 2 ...
##  $ Education_Level   : Factor w/ 17 levels "Not Applicable",..: 14 15 1 14 14 10 10 7 10 5 ...
##  $ FoodSecurity_score: Factor w/ 4 levels "High Food Security",..: 1 1 1 1 1 1 1 2 2 1 ...

Number of Jobs

Here, the Food Security codes are:

  • 1 High Food Security
  • 2 Marginal Food Security
  • 3 Low Food Security
  • 4 Very Low Food Security
  • -9 No Response

In the graphs, most respondents report High Food Security irrespective of the number of jobs. Let's use a Chi-square test to check whether the two are really independent of each other.

## 'data.frame':    1888 obs. of  5 variables:
##  $ Id                : Factor w/ 1734 levels "13041104291",..: 769 35 419 512 1326 1244 535 222 82 974 ...
##  $ Number_of_Jobs    : Factor w/ 3 levels "2 Jobs","3 Jobs",..: 1 1 1 1 1 2 1 1 1 2 ...
##  $ Hours_on_Jobs     : num  38 43 33 48 40 13 23 33 13 43 ...
##  $ Education_Level   : Factor w/ 15 levels "1st, 2nd, 3rd Or 4th Grade",..: 13 12 10 13 8 12 13 12 8 13 ...
##  $ FoodSecurity_score: Factor w/ 4 levels "High Food Security",..: 1 1 1 1 1 1 1 1 1 4 ...
## 'data.frame':    1888 obs. of  5 variables:
##  $ Id                : Factor w/ 27922 levels "5185410966","8178510165",..: 12278 466 6883 8428 21354 20006 8726 3657 1472 15457 ...
##  $ Number_of_Jobs    : Factor w/ 4 levels "Not Applicable",..: 2 2 2 2 2 3 2 2 2 3 ...
##  $ Hours_on_Jobs     : num  38 43 33 48 40 13 23 33 13 43 ...
##  $ Education_Level   : Factor w/ 17 levels "Not Applicable",..: 15 14 12 15 10 14 15 14 10 15 ...
##  $ FoodSecurity_score: Factor w/ 4 levels "High Food Security",..: 1 1 1 1 1 1 1 1 1 4 ...

| Number of Jobs | High Food Security | Marginal Food Security | Low Food Security | Very Low Food Security |
|---|---|---|---|---|
| 2 Jobs | 1425 | 115 | 112 | 54 |
| 3 Jobs | 136 | 7 | 7 | 9 |
| 4 or more jobs | 19 | 2 | 0 | 2 |

## 
##  Pearson's Chi-squared test
## 
## data:  contable_number_of_jobs
## X-squared = 8.4874, df = 6, p-value = 0.2045

The test gave a warning because the expected counts for some cells are very low. The p-value for the Chi-square test is 0.2045, which is greater than our alpha of 0.05. Hence we fail to reject the null hypothesis: we find no significant association between Number of Jobs and Food Security.
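The test above can be re-run directly from the contingency table. The sketch below rebuilds the table by hand (the object name `jobs` is made up for illustration), counts how many cells have an expected count below 5, and shows a simulation-based p-value as one possible alternative when the approximation is in doubt.

```r
# Contingency table copied from the counts above
jobs <- matrix(c(1425, 115, 112, 54,
                  136,   7,   7,  9,
                   19,   2,   0,  2),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("2 Jobs", "3 Jobs", "4 or more jobs"),
                               c("High", "Marginal", "Low", "Very Low")))

# chisq.test() warns here because some expected counts are small
res <- suppressWarnings(chisq.test(jobs))
res$statistic   # X-squared, about 8.4874
res$parameter   # degrees of freedom: (3 - 1) * (4 - 1) = 6
res$p.value     # about 0.2045

# Number of cells whose expected count is below 5
low_cells <- sum(res$expected < 5)
low_cells

# Monte Carlo p-value that does not rely on the chi-squared approximation
set.seed(1)
sim <- chisq.test(jobs, simulate.p.value = TRUE, B = 2000)
sim$p.value
```

All of the low expected counts fall in the "4 or more jobs" row, which has only 23 respondents.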

Education Level

## 'data.frame':    58914 obs. of  5 variables:
##  $ Id                : Factor w/ 27922 levels "5185410966","8178510165",..: 16600 9378 8472 8472 7861 7861 19375 19375 24604 32 ...
##  $ Number_of_Jobs    : Factor w/ 4 levels "Not Applicable",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ Hours_on_Jobs     : num  67 43 43 43 62 43 2 2 2 43 ...
##  $ Education_Level   : Factor w/ 16 levels "Less Than 1st Grade",..: 13 14 13 13 9 9 6 9 4 13 ...
##  $ FoodSecurity_score: Factor w/ 4 levels "High Food Security",..: 1 1 1 1 1 1 2 2 1 1 ...
##                                        Var1  Freq
## 1                       Less Than 1st Grade   144
## 2                1st, 2nd, 3rd Or 4th Grade   251
## 3                          5th Or 6th Grade   446
## 4                          7th Or 8th Grade   912
## 5                                 9th Grade  1241
## 6                                10th Grade  1538
## 7                                11th Grade  1625
## 8                     12th Grade No Diploma   910
## 9   High School Grad-Diploma Or Equiv (Ged) 16004
## 10               Some College But No Degree  9492
## 11 Associate Degree-Occupational/Vocational  2454
## 12        Associate Degree-Academic Program  3257
## 13                         Bachelors Degree 12871
## 14                           Masters Degree  5720
## 15                  Professional School Deg   858
## 16                         Doctorate Degree  1191

The response is understood as follows:

31 LESS THAN 1ST GRADE
32 1ST, 2ND, 3RD OR 4TH GRADE
33 5TH OR 6TH GRADE
34 7TH OR 8TH GRADE
35 9TH GRADE
36 10TH GRADE
37 11TH GRADE
38 12TH GRADE NO DIPLOMA
39 HIGH SCHOOL GRAD-DIPLOMA OR EQUIV (GED)
40 SOME COLLEGE BUT NO DEGREE
41 ASSOCIATE DEGREE-OCCUPATIONAL/VOCATIONAL
42 ASSOCIATE DEGREE-ACADEMIC PROGRAM
43 BACHELOR’S DEGREE (EX: BA, AB, BS)
44 MASTER’S DEGREE (EX: MA, MS, MEng, MEd, MSW)
45 PROFESSIONAL SCHOOL DEG (EX: MD, DDS, DVM)
46 DOCTORATE DEGREE (EX: PhD, EdD)

##                      Less Than 1st Grade 
##                                      144 
##               1st, 2nd, 3rd Or 4th Grade 
##                                      251 
##                         5th Or 6th Grade 
##                                      446 
##                         7th Or 8th Grade 
##                                      912 
##                                9th Grade 
##                                     1241 
##                               10th Grade 
##                                     1538 
##                               11th Grade 
##                                     1625 
##                    12th Grade No Diploma 
##                                      910 
##  High School Grad-Diploma Or Equiv (Ged) 
##                                    16004 
##               Some College But No Degree 
##                                     9492 
## Associate Degree-Occupational/Vocational 
##                                     2454 
##        Associate Degree-Academic Program 
##                                     3257 
##                         Bachelors Degree 
##                                    12871 
##                           Masters Degree 
##                                     5720 
##                  Professional School Deg 
##                                      858 
##                         Doctorate Degree 
##                                     1191

| Education Level | High Food Security | Marginal Food Security | Low Food Security | Very Low Food Security |
|---|---|---|---|---|
| Not Applicable | 9703 | 1238 | 1199 | 418 |
| Less Than 1st Grade | 95 | 20 | 17 | 12 |
| 1st, 2nd, 3rd Or 4th Grade | 158 | 36 | 41 | 16 |
| 5th Or 6th Grade | 276 | 55 | 82 | 33 |
| 7th Or 8th Grade | 621 | 111 | 129 | 51 |
| 9th Grade | 894 | 141 | 134 | 72 |
| 10th Grade | 1094 | 190 | 167 | 87 |
| 11th Grade | 1168 | 182 | 178 | 97 |
| 12th Grade No Diploma | 639 | 106 | 120 | 45 |
| High School Grad-Diploma Or Equiv (Ged) | 12435 | 1575 | 1328 | 666 |
| Some College But No Degree | 7751 | 744 | 626 | 371 |
| Associate Degree-Occupational/Vocational | 2033 | 182 | 160 | 79 |
| Associate Degree-Academic Program | 2773 | 215 | 162 | 107 |
| Bachelors Degree | 11860 | 477 | 351 | 183 |
| Masters Degree | 5416 | 140 | 108 | 56 |
| Professional School Deg | 828 | 17 | 7 | 6 |
| Doctorate Degree | 1157 | 13 | 11 | 10 |

## 
##  Pearson's Chi-squared test
## 
## data:  contable_edu
## X-squared = 3136.8, df = 48, p-value < 2.2e-16

Since the p-value is far below 0.05, we reject the null hypothesis of independence: Education_Level is significantly associated with the Food Security of people.
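A significant test does not by itself show the direction of the association, so row percentages can help. The sketch below uses four rows copied from the education table above (the object name `edu` and the shortened row labels are made up for illustration) and computes the share of each food security level within each education level.

```r
# Four rows copied from the education contingency table above
edu <- matrix(c(   95,   20,   17,  12,
                12435, 1575, 1328, 666,
                11860,  477,  351, 183,
                 1157,   13,   11,  10),
              nrow = 4, byrow = TRUE,
              dimnames = list(c("Less Than 1st Grade", "High School Grad",
                                "Bachelors Degree", "Doctorate Degree"),
                              c("High", "Marginal", "Low", "Very Low")))

# Row percentages: distribution of food security within each education level
row_pct <- round(100 * prop.table(edu, margin = 1), 1)
row_pct
```

In these four rows the share reporting High Food Security rises from about 66% (less than 1st grade) to about 97% (doctorate degree).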

Hours on Jobs

## 'data.frame':    71472 obs. of  5 variables:
##  $ Id                : Factor w/ 27922 levels "5185410966","8178510165",..: 16600 9378 9378 8472 8472 7861 7861 19375 19375 24604 ...
##  $ Number_of_Jobs    : Factor w/ 4 levels "Not Applicable",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ Hours_on_Jobs     : num  67 43 2 43 43 62 43 2 2 2 ...
##  $ Education_Level   : Factor w/ 17 levels "Not Applicable",..: 14 15 1 14 14 10 10 7 10 5 ...
##  $ FoodSecurity_score: Factor w/ 4 levels "High Food Security",..: 1 1 1 1 1 1 1 2 2 1 ...
##    Var1  Freq
## 1     1  1921
## 2     2 37708
## 3     3    36
## 4     4    23
## 5     5    36
## 6     6    44
## 7     7    81
## 8     8   100
## 9     9    74
## 10   10    23
## 11   11   166
## 12   12    22
## 13   13   369
## 14   14    15
## 15   15   169
## 16   16    12
## 17   17    34
## 18   18   374
## 19   19   200
## 20   20    20
## 21   21    92
## 22   22     9
## 23   23  1209
## 24   24    19
## 25   25    38
## 26   26    19
## 27   27   302
## 28   28   580
## 29   29    37
## 30   30    33
## 31   31    79
## 32   32    21
## 33   33  1023
## 34   34    11
## 35   35   394
## 36   36    27
## 37   37    39
## 38   38   907
## 39   39   426
## 40   40   141
## 41   41   266
## 42   42    32
## 43   43 18496
## 44   44    11
## 45   45   159
## 46   46    64
## 47   47    82
## 48   48  1258
## 49   49    25
## 50   50    30
## 51   51   213
## 52   52    15
## 53   53  2050
## 54   54     1
## 55   55    28
## 56   56    13
## 57   57    13
## 58   58   427
## 59   59    33
## 60   60     5
## 61   61    22
## 62   62   906
## 63   63     1
## 64   64     1
## 65   65     1
## 66   66     5
## 67   67    85
## 68   68     7
## 69   69     6
## 70   70     3
## 71   71     1
## 72   72   146
## 73   73    47
## 74   74     1
## 75   75     2
## 76   76    22
## 77   77     2
## 78   78    82
## 79   79    24
## 80   80     2
## 81   81     1
## 82   82     1
## 83   83     1
## 84   84     1
## 85   85    16
## 86   86     4
## 87   87     5
## 88   88    23

## [1] 0

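The check for code -4 (hours vary) returns 0 respondents. Note, however, that the numeric Hours_on_Jobs column above holds the factor level indices (1 to 88) rather than the original codes, so index 1, with 1,921 observations, corresponds to the original code -4 and index 2, with 37,708 observations, to the original code -1.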
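Comparing the structure summaries above, the numeric Hours_on_Jobs values (67, 43, 2, ...) match the factor's internal level codes rather than the hours shown in the first rows of the data (65, 40, -1, ...). This suggests the column may have been converted by calling `as.numeric()` directly on the factor. Assuming that is what happened, the sketch below shows the difference on a small hypothetical factor (the values and object names are made up for illustration).

```r
# Hypothetical factor mimicking coded weekly hours stored as a factor
hours_f <- factor(c("65", "40", "-1", "40", "-4"),
                  levels = c("-4", "-1", "40", "65"))

# Direct conversion returns the internal level indices, not the hours
codes <- as.numeric(hours_f)
codes    # 4 3 2 3 1

# Going through character first recovers the original values
hours <- as.numeric(as.character(hours_f))
hours    # 65 40 -1 40 -4

# Counting respondents with the "hours vary" code now works as intended
n_vary <- sum(hours == -4)
n_vary   # 1
```

With the second form, special codes such as -4 and -1 stay visible and can be filtered out before computing summaries or running an ANOVA.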

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.00    2.00    2.00   19.58   43.00   88.00
## [1] 2

##                       Df   Sum Sq Mean Sq F value Pr(>F)    
## FoodSecurity_score     3   279623   93208   214.6 <2e-16 ***
## Residuals          71468 31045414     434                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = Hours_on_Jobs ~ FoodSecurity_score, data = food_hoj)
## 
## $FoodSecurity_score
##                                                     diff       lwr        upr
## Marginal Food Security-High Food Security     -4.2140290 -4.972646 -3.4554123
## Low Food Security-High Food Security          -5.6863555 -6.488530 -4.8841810
## Very Low Food Security-High Food Security     -6.0702228 -7.206149 -4.9342962
## Low Food Security-Marginal Food Security      -1.4723264 -2.531399 -0.4132541
## Very Low Food Security-Marginal Food Security -1.8561937 -3.186036 -0.5263518
## Very Low Food Security-Low Food Security      -0.3838673 -1.739029  0.9712947
##                                                   p adj
## Marginal Food Security-High Food Security     0.0000000
## Low Food Security-High Food Security          0.0000000
## Very Low Food Security-High Food Security     0.0000000
## Low Food Security-Marginal Food Security      0.0020125
## Very Low Food Security-Marginal Food Security 0.0019070
## Very Low Food Security-Low Food Security      0.8860333

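With 1 = High, 2 = Marginal, 3 = Low and 4 = Very Low Food Security, the pairs 2-1, 3-1, 4-1, 3-2 and 4-2 differ significantly in their means. Only the pair 4-3 (Very Low vs Low Food Security) does not (adjusted p = 0.886).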
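For readers who want to reproduce the shape of this analysis, the sketch below runs a one-way ANOVA followed by Tukey's HSD on simulated data. The group means, sample sizes and object names are made up for illustration and are not estimates from the survey.

```r
# Simulated data: 50 hypothetical respondents per food security level
set.seed(42)
lvls <- c("High", "Marginal", "Low", "Very Low")
grp <- factor(rep(lvls, each = 50), levels = lvls)
hours <- rnorm(200, mean = rep(c(40, 36, 34, 34), each = 50), sd = 10)

# One-way ANOVA: does mean hours differ across food security levels?
fit <- aov(hours ~ grp)
summary(fit)

# Tukey's HSD: all pairwise differences with family-wise 95% intervals
tuk <- TukeyHSD(fit, conf.level = 0.95)
tuk$grp   # one row per pair; columns diff, lwr, upr, p adj
```

With four groups there are six pairwise comparisons, and a pair is significant at the 5% level when its interval excludes zero.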

States, Family Size, and Household Income

##   AL   AK   AZ   AR   CA   CO   CT   DE   DC   FL   GA   HI   ID   IL   IN   IA 
## 1207  970 1080 1258 6975  764  593  832 1208 2738 1421 1125 1304 2052 1265  884 
##   KS   KY   LA   ME   MD   MA   MI   MN   MS   MO   MT   NE   NV   NH   NJ   NM 
##  952  808 1606  564  963 1352 1762  999 1505 1099 1255  791 1007  994 1397 1242 
##   NY   NC   ND   OH   OK   OR   PA   RI   SC   SD   TN   TX   UT   VT   VA   WA 
## 2580 1503 1143 1681  985 1247 1928  647 1028  840 1452 3946 1306 1030 1301 1404 
##   WV   WI   WY 
## 1214 1092 1173

California has the highest number of respondents (6975), whereas Maine has the smallest (564). In order to compare, I am choosing pairs of states with similar numbers of respondents: Alabama (1207) and Washington DC (1208), Florida (2738) and New York (2580), and Illinois (2052) and Pennsylvania (1928).

Plotting barcharts between all the levels of state and food security status


##                 Id States Family_Size Household_Income FoodSecurity_score
## 1  404006407110031     AL           1               16 High Food Security
## 7  147240092351000     AL           2               14 High Food Security
## 8  147240092351000     AL           2               14 High Food Security
## 16 128450301231000     AL           2               12 High Food Security
## 17 128450301231000     AL           2               12 High Food Security
## 18 114580195861000     AL           2               13 High Food Security
##    PRNMCHLD
## 1         0
## 7         1
## 8         0
## 16        0
## 17        0
## 18        0

Reference to household income:
1 LESS THAN $5,000
2 5,000 TO 7,499
3 7,500 TO 9,999
4 10,000 TO 12,499
5 12,500 TO 14,999
6 15,000 TO 19,999
7 20,000 TO 24,999
8 25,000 TO 29,999
9 30,000 TO 34,999
10 35,000 TO 39,999
11 40,000 TO 49,999
12 50,000 TO 59,999
13 60,000 TO 74,999
14 75,000 TO 99,999
15 100,000 TO 149,999
16 150,000 OR MORE

## 
##  Pearson's Chi-squared test
## 
## data:  income_t
## X-squared = 9512.9, df = 45, p-value < 2.2e-16

Since the p-value is below 0.05, we can say that Household income is significantly associated with food insecurity.

## Warning in chisq.test(family_t): Chi-squared approximation may be incorrect
## 
##  Pearson's Chi-squared test
## 
## data:  family_t
## X-squared = 1691.7, df = 39, p-value < 2.2e-16

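Since the p-value is below 0.05, we can say that Family size is significantly associated with food insecurity, although the warning indicates that some expected counts are small.

As the boxplot shows, larger families (more than 6 people) tend to have higher food insecurity. Household income is also directly related to food security: when household income is above $40,000, the food security score is low, which on the 1 to 4 coding means higher food security.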

##     1     2     3     4     5     6     7     8     9    10    11    12    13 
##  9210 21816 12753 13724  7810  3660  1344   560   288   120    88    72    13 
##    14 
##    14


An interesting point from this graph is that when family size is bigger, household income is high and the family has high food security. Taken separately, Family size and Household income each have a significant relationship with food security. However, when they are combined, the result is different. For further analysis, we need to consider the age and employment type of the family members.